Fractal analysis on internet traffic time series 
Fractal behavior and long-range dependence have been observed in
tele-traffic measurement and characterization. In this paper we show
the results of applying fractal analysis to internet traffic via
various methods. Our results demonstrate that the internet traffic
exhibits self-similarity, with spectral exponent β (1 < β < 2). Our
analysis gives a Holder exponent H (0 < H < 0.5), a fractal dimension
D (1 < D < 2) and a correlation coefficient ρ (-1/2 < ρ < 0).
Time-scale analysis is shown to be an effective way to characterize
the local irregularity. Based on the results of this study, these two
Internet time series exhibit fractal characteristics with long-range
dependence.

PACS numbers: 05.45.-a, 05.45.Df, 05.45.Tp
I. INTRODUCTION

Fractal behavior and long-range dependence have been observed in many
phenomena, chiefly in fluctuations of physical systems such as
diffusion, financial time series, tele-traffic, and time series of
heart rate dynamics and human gait. In this paper, we characterize the
dynamics of internet traffic time series. We apply fractal analysis to
the internet traffic time series via various methods, namely
power-spectral analysis (PSA), detrended fluctuation analysis (DFA)
and time-scale analysis (TSA).

The data analyzed are the sizes of documents transferred through the
Library of Congress (LOC) WWW server. Two types of Internet traffic,
namely LOC(request) and LOC(send), are examined in this paper.
LOC(request) is the time series of sizes of documents transferred into
the server, whereas LOC(send) is the time series of sizes of documents
transferred out from the server. These internet traffic time series
play an important role in determining how smoothly a particular server
can be accessed.

The presence of "burstiness" across an extremely wide range of time
scales in both time series shows that these internet traffic time
series differ from the conventional models for telephone traffic,
i.e., pure Poisson or Poisson-related formal models for packet
traffic.



II. SOME PROPERTIES OF FRACTAL

A fractal characterizes an object or process by a fractional geometry,
or simply a fractal dimension, D. A fractal object can be
characterized by a dimension between two integers, e.g., D = 1.5.
Fractals have the following two important properties:

(a) Self-similarity or self-affinity. A part of a fractal object is
similar to other parts even at different scales. This property is
known as scale invariance: a fractal object looks similar at all
possible scales. Self-similarity exists when the object shows
similarity under isotropic scaling, whereas self-affinity exists when
the object shows similarity under anisotropic scaling.

(b) Self-similar hierarchical structure under magnification. A fractal
object has a complex inner structure and shows similar geometry even
at different magnification scales.

Due to the scale invariance, a power-law behavior exists between two
parameters in a fractal phenomenon, like

\[ f(x) \propto x^{c}, \tag{1} \]

where f(x) is a function of the object under study and c is a
constant. From the example given in [20], one can estimate the fractal
dimension through this power-law behavior. The standard definition of
fractional Brownian motion was introduced by Mandelbrot and Van Ness
and is given by

\[ B_H(t) = \frac{1}{\Gamma(H + \frac{1}{2})} \left\{ \int_{-\infty}^{0} \left[ (t-s)^{H-1/2} - (-s)^{H-1/2} \right] dB(s) + \int_{0}^{t} (t-s)^{H-1/2} \, dB(s) \right\}, \tag{2} \]

with Holder exponent 0 < H < 1. Fractional Brownian motion has the
following properties:

\[ E[B_H(t)] = 0, \tag{3} \]

\[ E[B_H(t)^2] \propto |t|^{2H}, \tag{4} \]

\[ E[B_H(t) B_H(s)] \propto \frac{1}{2} \left( |t|^{2H} + |s|^{2H} - |t-s|^{2H} \right). \tag{5} \]
From Eq. (4), the correlation between increments of B_H(t) can be
written in equation form. For fractal processes, ρ can be defined as

\[ \rho = \frac{\langle -B_H(-t) \, B_H(t) \rangle}{\langle B_H(t)^2 \rangle}, \tag{6} \]

\[ \therefore \ \rho = 2^{2H-1} - 1, \tag{7} \]

where B_H(0) = 0, so that the past increment is
B_H(0) - B_H(-t) = -B_H(-t) and the future increment is
B_H(t) - B_H(0) = B_H(t). If y(t) is a fractal process with Holder
exponent H, then for an arbitrary constant c > 0,

\[ y(ct) \overset{d}{=} c^{H} y(t), \tag{8} \]

i.e., the rescaled process y(ct) is also a fractal process, with the
same statistical distribution as c^H y(t). The fractal dimension is
given by

\[ D = 2 - H, \tag{9} \]

and Table I gives the relationships among H, D, the correlation and
the process behavior.

TABLE I: Different values of H and D and their associated processes.

H        D        Correlation   Process behavior
> 0.5    < 1.5    positive      persistence
= 0.5    = 1.5    zero          Brownian motion
< 0.5    > 1.5    negative      anti-persistence
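As a quick numerical check of Eqs. (7) and (9), the relations can be evaluated directly. The sketch below is only an illustration; the function names `correlation_and_dimension` and `process_behavior` are hypothetical and do not come from the original work.

```python
def correlation_and_dimension(H):
    """Return (rho, D) for a Holder exponent H.

    rho follows Eq. (7): rho = 2**(2H - 1) - 1
    D   follows Eq. (9): D = 2 - H
    """
    rho = 2.0 ** (2.0 * H - 1.0) - 1.0
    D = 2.0 - H
    return rho, D


def process_behavior(H, tol=1e-12):
    """Classify a process by its Holder exponent, following Table I."""
    if abs(H - 0.5) <= tol:
        return "Brownian motion"
    return "persistence" if H > 0.5 else "anti-persistence"


if __name__ == "__main__":
    # Illustrative values of H, one for each row of Table I.
    for H in (0.2, 0.5, 0.8):
        rho, D = correlation_and_dimension(H)
        print(f"H={H:.2f}  rho={rho:+.3f}  D={D:.2f}  {process_behavior(H)}")
```

For H = 0.5 the correlation vanishes and D = 1.5, which is the Brownian-motion row of Table I; values of H below 0.5 give a negative ρ.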

III. POWER-SPECTRAL ANALYSIS (PSA)

A time series can be described in the time domain as x(t) and also in
the frequency domain in terms of its Fourier transform X(ω), where ω
is the frequency. The autocorrelation function of a non-stationary
time series is given as

\[ R_{xx}(\tau) = \int_{-\infty}^{+\infty} E[x(t) \, x(t+\tau)] \, dt, \tag{10} \]

and the Fourier transform of this autocorrelation function is equal to
|X(ω)|^2; therefore the power-spectral density of a time series can be
written as

\[ S(\omega) = |X(\omega)|^2. \tag{11} \]

The Wiener-Khintchine theorem expresses this relationship between the
Fourier transform of the autocorrelation function and the
power-spectral density of a time series, as

\[ R_{xx}(\tau) \longleftrightarrow S(\omega). \tag{12} \]

The power-spectral density provides an important parameter that
characterizes the persistence in a time series. For a self-affine time
series, the power spectrum obeys a frequency-based power law, given by

\[ S_m(\omega) \sim \omega_m^{-\beta}, \quad m = 1, 2, \ldots, \frac{N}{2}, \tag{13} \]

where ω_m = m/N, N is the length of the time series, and the spectral
exponent β characterizes the persistence. The relationship between β,
H and D is given by

\[ \beta = 2H + 1 = 5 - 2D. \tag{14} \]

A least-squares best-fit line is applied to the log-log power spectrum
to obtain the value of β. PSA provides only the global Holder exponent
H, since the Fourier transform uses harmonic basis functions. PSA is a
conventional method in fractal analysis because it gives a convenient
estimate of H.

RESULT (A)

The power-spectral exponent β, Holder exponent H, fractal dimension D
and correlation coefficient ρ of LOC(request) and LOC(send) were
estimated with the PSA method and are tabulated in Table II. Fig. 1
shows the power spectra of the LOC(request) and LOC(send) time series.
PSA shows that LOC(request) and LOC(send) exhibit fractal
characteristics with long-range dependence.

TABLE II: β, H, ρ and D for LOC(request) and LOC(send).

Time series     β            H            ρ             D
LOC(request)    1.59±0.01    0.30±0.01    -0.24±0.01    1.70±0.01
LOC(send)       1.61±0.01    0.31±0.01    -0.23±0.01    1.69±0.01
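The PSA estimate of β from Eq. (13), followed by the conversion to H and D through Eq. (14), can be sketched as below. This is a minimal illustration using a raw periodogram and an unweighted log-log least-squares fit; the function names are hypothetical, the normalized frequency m/N is an assumption, and the original analysis may have used a different spectral estimator or fitting range.

```python
import numpy as np


def spectral_exponent(x):
    """Estimate beta from a log-log least-squares fit of the periodogram."""
    x = np.asarray(x, dtype=float)
    n = x.size
    X = np.fft.rfft(x - x.mean())          # Fourier transform of the series
    m = np.arange(1, n // 2 + 1)           # m = 1, ..., N/2 (zero frequency dropped)
    omega = m / n                          # assumed normalized frequency
    S = np.abs(X[1:n // 2 + 1]) ** 2       # power spectrum, Eq. (11)
    slope, _ = np.polyfit(np.log10(omega), np.log10(S), 1)
    return -slope                          # S ~ omega**(-beta), Eq. (13)


def psa_parameters(x):
    """Return (beta, H, D) using beta = 2H + 1 = 5 - 2D, Eq. (14)."""
    beta = spectral_exponent(x)
    H = (beta - 1.0) / 2.0
    D = (5.0 - beta) / 2.0
    return beta, H, D


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    white = rng.standard_normal(4096)      # synthetic data, not the LOC series
    print("white noise:", psa_parameters(white))
    print("random walk:", psa_parameters(np.cumsum(white)))
```

An unweighted fit over all frequencies gives most of the weight to the high-frequency end of the spectrum, so in practice the fitting range is often restricted or the spectrum is averaged in logarithmic bins before fitting.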



IV. DETRENDED FLUCTUATION ANALYSIS (DFA)

Detrended fluctuation analysis (DFA) has been widely used to determine
mono-fractal scaling properties and long-range dependence in noisy,
nonstationary time series. DFA estimates the root-mean-square
fluctuation of an integrated and detrended time series (a modified
root-mean-square analysis of a random walk) and is capable of
detecting long-range dependence. The integrated time series Y(i) is
defined as

\[ Y(i) \equiv \sum_{k=1}^{i} \left[ x_k - \langle x \rangle \right], \quad i = 1, \ldots, N, \tag{15} \]

where x_k is the k-th element of the time series and ⟨x⟩ is the
average of the time series of length N.
Next, Y(i) is divided into N_s = int(N/s) non-overlapping segments of
equal length s, as shown in Fig. 2. Since the length of the time
series is often not a multiple of the time scale s, a short part at
the end of the integrated time series may remain. To overcome this
problem, the same procedure is repeated starting from the opposite
end, so that the remaining part of the time series is analyzed too.
Therefore, the total number of segments is 2N_s. After the integrated
time series is divided into segments of equal length s, a
least-squares best-fit line is fitted to each segment to obtain the
local trend in that segment, as shown in Fig. 2. The detrending of the
time series is done by subtracting the least-squares best-fit line
from the integrated time series, and the variance of each segment is
calculated by

\[ F^2(s,\nu) \equiv \frac{1}{s} \sum_{i=1}^{s} \left\{ Y[(\nu-1)s + i] - y_\nu(i) \right\}^2, \tag{16} \]

for each segment ν, ν = 1, ..., N_s, and

\[ F^2(s,\nu) \equiv \frac{1}{s} \sum_{i=1}^{s} \left\{ Y[N - (\nu - N_s)s + i] - y_\nu(i) \right\}^2, \tag{17} \]

for each segment ν = N_s + 1, N_s + 2, ..., 2N_s. Here y_ν(i) is the
least-squares best-fit line in segment ν.

The last step of the detrending process is to average over all
segments of the time series to obtain the fluctuation function, given
as

\[ F(s) \equiv \left\{ \frac{1}{2N_s} \sum_{\nu=1}^{2N_s} F^2(s,\nu) \right\}^{1/2}. \tag{18} \]

F(s) increases with increasing s, and it is only defined for segment
lengths s > 4. A log-log plot of F(s) versus s is needed to determine
the scaling behavior. Therefore, the above steps are repeated for
several values of s to obtain a data set of F(s) versus s, as shown in
Fig. 3. The slope of the curve gives the scaling exponent α if the
time series is long-range power-law correlated. Hence, F(s) and s are
related by a power law, given as

\[ F(s) \sim s^{\alpha}. \tag{19} \]

The scaling exponent can be divided into a few categories, which are
summarized in Table III.

TABLE III: Categories of the scaling exponent α and the corresponding
processes.

Scaling exponent    Type of process
0 < α < 0.5         Power-law anti-correlation
α = 0.5             White noise
0.5 < α < 1.0       Long-range power-law correlation
α = 1.0             1/f process
α = 1.5             Brownian motion
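The steps of Eqs. (15) to (19) can be sketched as follows. This is a minimal first-order DFA written for illustration only; the function names and the choice of scales are assumptions, and it is not the implementation used to produce the results in this paper.

```python
import numpy as np


def dfa_fluctuation(x, s):
    """Fluctuation function F(s) of Eq. (18) for one segment length s."""
    x = np.asarray(x, dtype=float)
    n = x.size
    Y = np.cumsum(x - x.mean())                    # integrated series, Eq. (15)
    n_s = n // s                                   # N_s = int(N / s)
    i = np.arange(s)
    # Segment start indices: forward pass (Eq. 16) and backward pass (Eq. 17).
    starts = list(range(0, n_s * s, s)) + list(range(n - n_s * s, n, s))
    f2 = []
    for start in starts:
        seg = Y[start:start + s]
        trend = np.polyval(np.polyfit(i, seg, 1), i)   # local linear trend y_v(i)
        f2.append(np.mean((seg - trend) ** 2))         # segment variance
    return np.sqrt(np.mean(f2))                    # average over 2 N_s segments


def dfa_exponent(x, scales):
    """Scaling exponent alpha from the log-log slope of F(s), Eq. (19)."""
    F = np.array([dfa_fluctuation(x, s) for s in scales])
    alpha, _ = np.polyfit(np.log10(scales), np.log10(F), 1)
    return alpha, F


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    white = rng.standard_normal(8192)              # synthetic data, not the LOC series
    scales = [16, 32, 64, 128, 256]                # illustrative segment lengths
    print("white noise alpha:", dfa_exponent(white, scales)[0])
    print("random walk alpha:", dfa_exponent(np.cumsum(white), scales)[0])
```

Fitting one line per segment in a Python loop keeps the code close to the equations; for long series a vectorized detrending would be faster.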



RESULT (B)

To test the accuracy of the DFA algorithm used in this work, the
algorithm is used to calculate the scaling exponents of three
generated signals with known scaling exponents, namely Brownian
motion, a persistent power-law process and an anti-persistent
power-law process, with Holder exponents H = 0.50, H = 0.80 and
H = 0.20, respectively. The results are shown in Table IV. The
calculated DFA scaling exponents α are consistent with the Holder
exponents of the three generated signals (α ≈ H + 1 for such
motion-type signals), which verifies that the DFA algorithm is
accurate. The plot of F(s) versus s for the three signals is shown in
Fig. 4.

The scaling exponents of the Library of Congress sending and
requesting time series are estimated with the DFA method, and the
results are tabulated in Table V. The DFA results show that these two
signals exhibit a crossover phenomenon at segment lengths of 60 and
400, as shown in Fig. 5. The scaling exponents α of the two signals
are nearly identical: α changes from near white noise (s < 60) to a
1/f process (60 < s < 400) and finally to a process with α ≈ 2.00.

TABLE IV: α of the persistent power-law process, Brownian motion and
anti-persistent power-law process.

Time series                   DFA scaling exponent α
Persistence Power-Law         1.79 ± 0.03
Brownian                      1.51 ± 0.09
Anti-Persistence Power-Law    1.17 ± 0.10

TABLE V: α for LOC(request) and LOC(send).

Time series     α1             α2             α3
LOC(request)    0.63 ± 0.04    1.08 ± 0.05    2.01 ± 0.05
LOC(send)       0.65 ± 0.03    1.18 ± 0.05    1.95 ± 0.02


V. TIME-SCALE ANALYSIS (TSA) 

The methods described previously are based on a linear log-log plot,
which gives only a single value of H; these methods are insufficient
for estimating the locally time-varying Holder exponent H(t). The
wavelet approach is a powerful tool to solve this problem. The wavelet
transform (WT) acts as a mathematical microscope that is well adapted
to reveal the hierarchy that governs the spatial distribution of the
singularities of multifractal measures. We consider only the
continuous wavelet transform (CWT) in time-scale analysis in order to
estimate H(t). The CWT is defined as

\[ W_x(t,a;\psi) = \int_{-\infty}^{+\infty} x(s) \, \psi^{*}_{t,a}(s) \, ds, \tag{20} \]

where the wavelet at different scales is defined as

\[ \psi_{t,a}(s) = |a|^{-1/2} \, \psi\!\left( \frac{s-t}{a} \right), \tag{21} \]

and a is the scale parameter, with a ∝ 1/ω. In this paper, we use the
Morlet wavelet in the TSA, and the scalogram is defined through

\[ E_x = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} |W_x(t,a;\psi)|^2 \, dt \, \frac{da}{a^2}, \tag{22} \]

where E_x is the energy of the function x. The scalogram is therefore
an energy distribution function of a signal or time series in the
time-scale plane, associated with the measure dt da/a^2. Consider a
time series with uniform H, which satisfies

\[ |x(s) - x(t)| \le c \, |s-t|^{H}, \tag{23} \]

where c is a constant. Applying the CWT to x(t) gives

\[ |W_x(t,a;\psi)| \le c \, |a|^{H+1/2} \int_{-\infty}^{+\infty} |u|^{H} \, |\psi(u)| \, du, \tag{24} \]

and the scalogram of this time series is given by

\[ \Omega_{\mathrm{scalo}}(t,a) = |W_x(t,a)|^2 \sim |a|^{2H(t)+1}, \tag{25} \]

as a → 0. From Eq. (25), one can estimate H(t), and the global H can
be written as

\[ H_{\mathrm{global}} = \frac{1}{T} \int_{0}^{T} H(t) \, dt, \tag{26} \]

where T is the length of the time series. Thus, TSA provides both the
global H and the local H(t). TSA is therefore a more powerful tool
than PSA and DFA in fractal analysis, since most phenomena show
multifractal scaling behavior.

RESULT (C)

The scalogram allows one to estimate the local H(t) and the global H.
Fig. 6 shows the local H(t) for LOC(request) and LOC(send). The red
line represents the global H for each time series. The TSA results for
each time series are summarized in Table VI.

TABLE VI: Maximum and minimum values of H(t), global H and D for
LOC(request) and LOC(send).

Time series     H(t) [min.]    H(t) [max.]    Global H    D
LOC(request)    -0.49          1.48           0.32        1.68
LOC(send)       -0.26          1.15           0.27        1.73
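The scalogram scaling of Eq. (25) suggests a simple estimator: at each time t, regress log |W_x(t,a)|^2 on log a over small scales and read H(t) from the slope 2H(t) + 1. The sketch below is a rough illustration under stated assumptions: the Morlet centre frequency `w0 = 6`, the truncation of the wavelet at four scale lengths, the chosen scales and the function names are all hypothetical, and the software actually used for the results above is not specified in the text.

```python
import numpy as np


def morlet_cwt(x, scales, w0=6.0):
    """|W_x(t, a)| for a Morlet wavelet by direct convolution, Eqs. (20)-(21)."""
    x = np.asarray(x, dtype=float)
    out = np.empty((len(scales), x.size))
    for j, a in enumerate(scales):
        half = int(np.ceil(4 * a))                     # truncate the Gaussian envelope
        u = np.arange(-half, half + 1) / a             # (s - t) / a
        psi = np.exp(1j * w0 * u) * np.exp(-0.5 * u ** 2) / np.sqrt(a)
        kernel = np.conj(psi)[::-1]                    # correlation with psi*
        out[j] = np.abs(np.convolve(x, kernel, mode="same"))
    return out


def local_holder(x, scales):
    """Local H(t) from the per-time log-log slope of the scalogram, Eq. (25)."""
    W = morlet_cwt(x, scales)
    log_a = np.log(np.asarray(scales, dtype=float))
    log_p = np.log(W ** 2 + 1e-300)                    # guard against log(0)
    slope = np.polyfit(log_a, log_p, 1)[0]             # one fit per time point
    return (slope - 1.0) / 2.0                         # slope = 2 H(t) + 1


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    walk = np.cumsum(rng.standard_normal(4096))        # synthetic data, not the LOC series
    scales = [4, 6, 8, 12, 16, 24, 32, 48, 64]
    H_t = local_holder(walk, scales)
    core = H_t[300:-300]                               # discard edge-affected samples
    print("min, max, global H:", core.min(), core.max(), core.mean())
```

The mean of H(t) over the record plays the role of the global H in Eq. (26). Point-wise slopes are noisy, so individual values of H(t) can fall well outside the interval (0, 1) even for a simple synthetic signal.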



VI. DISCUSSION 

The analysis results show that these two internet traffic time series
exhibit fractal characteristics with long-range dependence. Therefore
a previous increment of the time series affects future increments; in
other words, both internet traffic time series behave like long-range
memory phenomena, like most phenomena in nature. Even though the
Fourier transform uses harmonic basis functions and has known
drawbacks for non-stationary signal processing, PSA can be used for
initial measurements in the fractal analysis of nonstationary time
series such as those studied here. From the PSA results, we obtain
H = 0.30±0.01 and 0.31±0.01 for LOC(request) and LOC(send),
respectively.

The DFA results show that the LOC(request) and LOC(send) time series
exhibit a crossover phenomenon between different segment lengths s.
This is probably due to the fact that on very short time scales (the
starting time of requesting and sending files), the internet traffic
time series is dominated by a highly uncorrelated fluctuation process.
As time goes on, these signals exhibit smoother fluctuations that
reflect the intrinsic dynamics of many electronic systems, which
usually produce an exponent α equal to one and are associated with
1/f-like processes.
Meanwhile, the TSA results show that these two internet traffic time
series are very complicated systems, with local H(t) ranging from
negative to positive values: -0.49 < H(t) < 1.48 for LOC(request) and
-0.26 < H(t) < 1.15 for LOC(send). H(t) for LOC(request) is also seen
to be more complex than H(t) for LOC(send). The different complexity
of H(t) for the two time series can be explained by analogy with road
traffic at a gateway to a metropolitan city. For LOC(request), the
data come from hundreds of millions of points on the web into a single
main gate at the LOC server, which creates a serious "traffic jam" at
the gateway of the LOC server. Furthermore, incoming signals interact
with one another at the gateway during periods when the incoming
signals are overloaded, causing network congestion. By comparison,
LOC(send) is more regular because the data are transferred from the
main gateway out to hundreds of millions of points on the web, which
is easier than the incoming case. The global H value, obtained as the
average of H(t), is therefore just an approximation that gives a
coarse picture of the dynamical behavior of the time series. Since
H(t) for LOC(request) and LOC(send) falls outside the range
0 < H < 1, these two internet traffic time series can be treated as
very complicated systems that merit further study; a good quantitative
description would advance our understanding of them. Nevertheless, TSA
provides extra information compared with PSA and DFA, since it gives
the local singularities and multifractal behavior, which allows us to
study the detailed behavior of complex systems such as internet
traffic time series.
VII. CONCLUSION

In this paper, we have examined the fractal characteristics and
long-range dependence in these two internet traffic time series. We
examined the LOC(request) and LOC(send) time series with three
techniques: power-spectral analysis (PSA), detrended fluctuation
analysis (DFA) and time-scale analysis (TSA). Other techniques for
examining long-range dependence, not discussed in this paper, include
dispersional analysis [11] and the maximum-likelihood estimator [12].
In summary, we find the following:

(1) PSA gives 1 < β < 2, 0 < H < 0.5, -0.5 < ρ < 0 and 1 < D < 2. PSA
shows that these two internet traffic time series exhibit fractal and
long-range dependence characteristics.

(2) We used the DFA method to analyze the networking signals and found
that these signals exhibit a crossover phenomenon at segment lengths
of 60 and 400. In addition, the requesting and sending signals have
nearly identical α exponents, which show white-noise-like behavior for
segment lengths below 60, 1/f behavior for segment lengths between 60
and 400, and a smoother process (α ≈ 2.00) for larger segment lengths.

(3) TSA gives a local H(t) with -0.5 < H(t) < 1.5, a global H with
0 < H < 0.5, and 1 < D < 2. TSA shows that LOC(request) and LOC(send)
are two complicated time series with local H(t) outside the range
between 0 and 1. These signals therefore require a more advanced
quantitative and qualitative description to improve our understanding
of internet traffic time series. In many ways, wavelet analysis is the
most effective method for performing fractal analysis, since it can be
used for data sets that are nonstationary and can perform multifractal
measurements. According to the analysis results, long-range dependence
and fractal characteristics are present in the LOC(request) and
LOC(send) time series. As the value of H approaches zero, the systems
become more complex. We therefore suggest that further fractal
analysis and modeling can be applied to internet traffic time series
to optimize network utilization.